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LARGE-SCALE  COMPUTER-AIDED  STATISTICAL  MATHEMATICS 


Peter  A.  W.  Lewis 
Naval  Postgraduate  School 
Monterey,  California 


Abstract 


Some  thoughts  on  large-scale  computer-aided  statistical  mathematics  (primarily 
simulation)  which  were  presented  at  the  6th  Annual  Conference  on  the  Computer 
Science/Statlstlcs  Interface  conference  are  presented.  Comments  of  participants 
and  panelists  (D.  F.  Andrews,  J.  N.  Arvesen,  D.  P.  Gaver,  and  G.  Marsaglia)  have 
been  added  to  the  original  text. 


1.  INTRODUCTION 

The  aim  of  this  paper  is  to  stimulate  discuosion 
on  large-scale  computer-aided  statistical  mathe¬ 
matics  (primarily  simulation)  at  this  conference. 
There  has  been  a  lot  of  discussion  of  computers 
and  statistics  (Hartley,  1972;  Milton  and  Nelder, 
1969;  Chambers,  1970),  but  little  of  large-scale 
use  of  computers  In  simulation  experiments  to 
solve  open  distributional  problems.  This  has  per- 
hafs  been  because  of  the  unavailability  of  large 
crmputers  and  large  amounts  of  computer  time  to 
research  workers  and  statisticians.  I  think  this 
v’i  11  change  rapidly  over  the  next  ten  years  as 
Internal  computation  speed  and  the  size  of  random 
access  memories  go  up.  The  talk  given  by  Dr.  A. 

G.  Anderson  at  this  conference  has  amply  illus¬ 
trated  this  trend. 

To  take  advantage  of  this  availability,  and  to  use 
the  .nternal  time  sharing  of  central  processing 
units  inherent  in  multiprogramming,  new  statistical 


techniques  which  are  computation-ori er  ed  will  have 
to  be  developed.  There  is  already  growing  impetus 
in  this  direction  and  this  new  technology  coupled 
to  the  computers  will  make  an  enormous  impact  on 
statistics.  I  should  note,  too,  that  large-scale 
simulations  are  commonplace  in  industry  and  devel¬ 
opment  laboratories,  but  the  inefficiency  of  most 
of  these  computations  is  appalling. 

There  are  recent  surveys  of  several  aspects  of 
statistical  computing  (Hemmerle,  1967;  Halton, 

1970;  Chambers,  1970;  Freiberger  and  Grenander, 
’.971),  most  notably  that  by  Tukey  (1972b)  who  has 
been  responsible  for  many  of  the  new  ideas  in  sta¬ 
tistical  computation.  Consequently,  I  will  only 
describe  here  the  evolution  of  a  computer  program 
called  COMPSTAT  which  was  developed  to  try  to  use 
the  IBM  360/91  computet  at  the  IBM  Research  Center 
as  efficiently  as  possible.  The  problems  encoun¬ 
tered  in  developing  this  program,  some  solved  but 
others  open,  are  more  than  enough  for  one  paper. 


•Sponsored  by  the  Office  of  Naval  Research  through  Contract  NR  042-288  and  the 
Foundation  Research  Program  at  the  Naval  Postgraduate  School. 
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My  interest,  in  the  problem  of  large-scale 
statistical  computation  grew  from  Che  frustration 
of  trying  to  deal  tith  non-normal  time  scries,  in 
particular,  point  processes  (Cox  and  Lewis,  1966), 
and  of  having  to  write  a  book  around  the  many  gaps 
in  the  distribution  theory.  The  first  problem  I 
tackled  was  the  distribution  of  product-moment 
statistics  (Lewis  and  Goodman,  1970),  since  one 
can,  in  principle,  find  recursion  relationships 
to  generate  the  distribution  for  successive  sample 
sizes  n.  It  took  six  months  to  verify  the 
mathematics,  six  months  to  program  it,  and  even 
then  I  wasn't  sure  enough  of  the  programming  to 
publish  the  results.  i  then  turned  to  simulation, 
and  quickly  ran  into  several  equally  frustrating 
problems : 

(1)  Many  procedures,  notably  variance  re¬ 
duction  techniques,  were  very  particular 
to  the  problem  at  hand  and  difficult  to 
generalize.  For  example,  a  technique 
which  works  In  estimating  the  mean  of  a 
distribution  may  not  work  when  one  is 
also  Interested  In  estimating  the  var¬ 
iance  or  higher  moments. 

(2)  Most  published  statistical  estimation 
(point  and  interval)  techniques  were 
"valid"  asymptotically,  and  were  pro¬ 
hibitively  expensive  in  terms  of  number 
of  operations  (addition  and  multipli¬ 
cation)  and  memory  cells  required. 

(3)  Most  "canned"  routines  were  slow  and 
generally  unreliable. 

(4)  "Tooling  up"  took  an  excessive  amount 
of  time,  and  storing  results,  tabu¬ 
lating  results  and  manipulating  results 
was  difficult. 

It  was  therefore  decided  to  look  into  the  proce¬ 
dures  and  algorithms  available,  program  them 
efficiently  if  they  were  useful,  develop  new 
techniques  which  were  fast  and  economical  of 
storage  where  necessary,  and  put  them  into  a 
standard  program  which  could  be  used  for  large 
scale  simulations. 


Several  guidelines  were  set: 

a)  All  procedures  were  to  be  omput at ional ly 
simple,  use  as  little  memory  as  possible 
and  to  be  as  broadly  applicable  as  pos¬ 
sible.  In  particular,  his  meant  they 
should  ui.i  as  little  information  as 
possible  about  the  statistic,  say  S,  to 
be  simulated.  For  example,  one  might  not 
want  to  use  the  specific  information  that 
S  was  positive. 

b)  To  utilize  the  speed  of  the  computers, 
the  best  way  seemed  to  be  to  compute  the 
distributions  of  as  many  statistics  as 
possible  simultaneously. 

c)  Memory  requirements  should  be  kept  fixed 
ana  relatively  small  in  order  to  use 
excess  CPU  (Central  Processing  Unit) 
time  by  running  in  a  lowest  priority  par¬ 
tition  in  a  mult iprogrammed  invironment. 

A  block  diagram  of  the  over  ill  program,  COMPSTAT, 
which  was  developed  is  sh^»n  in  Figure  1.  We  dis¬ 
cuss  this  program  generally  before  going  into 
details  of  implementation  and  unsolved  problems 
in  later  sections. 

Referring  to  Figure  1,  the  symbol  n  is  used  to 
refer  to  sample  size  in  statistical  simulations, 
so  that  the  statistic  S  might  be  the  average 
of  the  observations  in  a  random  sample  of  size  n. 
Any  value  of  n  may  be  used  in  the  program, 
though  it  is  written  so  as  to  repeat  simulations 
on  successive  values  of  n  if  required.  A  number 
m  of  replications  is  specified  by  the  user,  with 
the  option  of  splitting  m  into  r  blocks  of 
size  m'  each  (m  »  rmf).  This  is  done  to  obtain 
estimates  of  the  variance  of  estimates  and  also 
to  allow  for  checkpoints  to  be  taken. 

On  each  replication  the  STATISTICS  GENERATOR  can 
call  for  {.in  random  numbers,  unsorted  or 
sorted  by  magnitude.  The  user  writbs  the  STA¬ 
TISTICS  GENERATOR,  spc'lfying  up  to  32  statistics 
(functions  of  the  i  random  variates) .  This 
has  proved  to  be  a  very  flexible  arrangement;  the 
statistics  could  be,  for  example, 
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FLOWGRAPH  OF  COMPSTAT  PROGRAM 
Figure  I. 


a)  the  sample  serial  correlations  of  lags  1 
Co  32  in  a  series  of  random  variables 
of  length  n; 

b)  the  waiting  times  of  32  successive  cus¬ 
tomers  in  a  simulated  queue; 

c)  an  estimate  of  a  parameter  in  a  distribu¬ 
tion,  the  jackknifed  estimate  of  the 
parameter,  the  jackknifed  variance,  and 
the  pseudo-values ; 


There  are  many  other  possibilities.  For  ea 
these  statistics  the  user  can  specify  that  he  wants 
estimates  of  the  first  four  moments  of  S,  16 
quantiles  of  the  distribution  of  S,  and  16  per¬ 
centiles  of  S  (or  any  combination  of  these  three). 
Quantiles  here  is  used  to  mean  the  solution  xa 
of  the  equation  a  »  F g  (xa) ,  where  a  jji  given  and 
Fg  (x)  is  the  distribution  of  S.  A  percentile  is 
just  F(x)  for  given  x.  We  assume  the  quantile 
exists. 


d)  32  points  in  the  simulated  spectrum  of  a 
time  series  of  length  n. 
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7.  789 

8.  1  1  3 

8,  229 

8.  642 
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Exponent  ial 

1  1.8*4 
(0.  001) 
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{0.  001) 

6.  133 
(0. 005) 
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(0.  003) 

7.  068 
(0.  002) 

7.  44  5 
(0.  002) 

7.  580 
(0. 002) 

H.  063 

(0.  002) 

8  65H 
(0.  001) 

i/2  Wei  bull 

*1  3.  8*8 
(0.  011) 

85.  678 
(0,  033) 

67  *80 
(0.  067) 

72.  483 
(0.  OHO) 

80.  052 
(0.  082) 

87.  0t>8 
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(0.  032) 
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124.  384 
(0.  032) 

Skewnea  a 

Kurio*:  a 

Upper  Quant 
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X0.  999 

No  rmal 
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18,  1  76 
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20.  024 

21,  415 

23.  251 

*  1.  6  38 

Exponential 

1.  061 
(0.  003) 

*.  1*0 
(0,  0*3) 

1  5,  51 4 
(0.  004) 

17.  064 
(0,  005) 

1  8. 5’2 
(0.  005) 

19.  061 

C  Cl.  006) 

20.  552 
(0.  009) 

22.  04  5 
(0.  009) 

24. 055 
(0.  024) 

25.  557 
(0.  033 

1/2  Wetbull 

1.  557 
(0.  003) 

5.  082 
(0,  03  5) 

323.  01  2 
(0.  084) 

373.91* 

(0.  059) 

425.  81  6 
(0.  1  36) 

442. 749 
(0.  172) 

496. 882 
(0.  182) 

553.  934 
(0,  328) 

634.  068 
(0.  472) 
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(1  696) 

Table  1 


Tablo  1  shows  the  form  chosen  to  tabulate  the 
results  (moments  and  quantiles)  of  a  simulation 
Involving  m  replications  for  each  n.  These 
results  are  averages  and  sample  standard  devia¬ 
tions  of  the  results  of  the  r  blocks  of  m' 
replications,  all  of  this  being  stored  in  an 
archive  which  Tukey  has  aptly  called  the  SIDEPUT. 
The  estimated  standard  deviations  of  the  estimates 
are  given  in  brackets  just  below  the  estimates; 
below  them  we  give  (not  shown)  the  estimated 
quantiles  after  subtraction  of  the  estimated  mean 
p  and  division  by  the  estimated  standard  devia¬ 
tion  0.  This  allows  the  experimenter  to  judge 
whether  the  statistic  is  approximately  normally 
distributed. 

The  last  blocks  in  Figure  1  allow  for  CUMULATIVE 
TABULATION  on  n,  EDITING  and  SMOOTHING  of  the 
results  (including  rounding  and  printing  tables 
for  publication)  ,  and  GRAPHICAL  OUTPUT  as  shown 
in  Figures  2  A3.  It  is  easy  to  see  in  the  figures 
that  this  statistic  is  not  normally  distributed 
and  is  converging  very  slowly  with  n  to  the 
asymptotic  (ir*“)  distribution.  The  positive 


skewness  of  the  distribution  is  also  evident. 

An  original,  rather  inefficient,  COMPSTAT  program 
was  used  to  implement  a  study  of  tests  of  inde¬ 
pendence  in  point  processes.  Twenty  statistics 
were  computed  simultaneously  on  an  IBM  360/91  in 
a  120K  partition.  Some  of  these  results  have  been 
published  (Lewis,  1972)  and  are  partially  repro¬ 
duced  here  (Figures  2  and  3);  others  will  appear 
later. 

A  study  on  a  similar  scale  of  robust  estimates  of 
location  was  um’ertaken  at  Princeton  (Andrews, 
et  al,  1972);  they  had  the  advantage  over  me  of 
both  manpower  and  expertise. 

It  is  hoped  to  rewrite  the  COMPSTAT  program  at 
some  later  time  in  order  to  incorporate  all  of  the 
recent  advances  in  statistical  computing  technol¬ 
ogy  described  below. 

2 .  DETAILS 

We  discuss  now  the  details  of  the  implementation 
of  a  program  such  as  COMPSTAT.  At  its  inception 
in  1966  we  quickly  ran  up  against  the  lack  of  real 
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Figure  2 


computational  considerations  In  many  standard 
statistical  procedures.  The  situation  is  better 
at  present,  with  books  such  as  Hemaerle  (1967) 
and  Knuth  (1969)  now  available.  Knuth  (1969)  in 
particular  la  Invaluable.  There  are  still  many 
problems,  however,  particularly  relating  to  large- 
scale  computations. 

a)  Random  Number  Generation 

Clearly  the  statistical  quality  of  the  random 
numbers  available  for  large-scale  simulations 
will  be  the  limiting  factor  In  how  far  one  can  go 
In  utilizing  large-scale  computers  in  simulations. 
In  1966  the  main  generator  In  use  was  RANDU  in 
the  IBM  SSP  package.  It  Is  still  widely  used  to¬ 
day,  by  default,  even  though  it  Is  known  to 


knowledgeable  users  to  have  poor  statie  leal  pro¬ 
perties.  There  are  no  published  test  i  suits  on 
RANDU,  except  one  brought  to  my  attenr'  n  at  the 
conference  (Bates  and  Zirkle,  1971)  but  there  are 
papers  published  on  problems  which  have  been  en¬ 
countered  with  its  use.  Moreover,  as  a  statistical 
consultant  one  comes  up  against  many  cases  In  which 
strange  results  in  simulations  are  remedied  by  re¬ 
placing  RANDU  by  another  random  number  generator. 

In  this  respect  it  might  be  noted  that  if  statis¬ 
ticians  are  guilty  of  ignoring  computational 
aspects  of  their  procedures,  computer  scientists 
are  equally  guilty  of  ignoring  the  statistical 
aspects  of  algorithms.  There  are  hundreds  of 
clever  random  number  algorithms  in  the  literature 
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Figure  3 


[see  the  bibliography  by  lienee  and  Overstreet, 
1972]  but  there  are  virtually  no  accompanying  test 
results  published.  Horeover,  It  is  generally  dif¬ 
ficult  to  do  so  In  a  computer  Journal. 

In  1966  we  started  to  Investigate  the  problem  of 
random  number  generation  for  large-scale  computa¬ 
tion,  and  after  extensive  testing  we  developed 
a  31-bit  pseudo-random  number  generator  for  the 
System  360.  This  is  a  multiplicative  congruential 
generator  of  the  form  (Lewis,  Goodman,  and  Miller, 
1969) 

J-1+1  =  Axt  (mod  p)  , 

31  5 

where  p,  a  prime,  is  2  1  and  A  -  7  is  a 


positive  primitive  root  of  p,  thus  guaranteeing 
a  cycle  of  length  p  for  the  generator.  Another 
advantage  of  using  a  positive  primitive  root  for 
the  multpller  is  that  low  order  bits  are  also 
random.  Beside  the  assembly  language  version  given 
in  the  paper,  a  very  fast  version  of  this 
generator  which  generates  arrays  of  integer  or 
floating  point  numbers  is  available  in  the  new  IBM 
SL/MATH  package.  We  will  refer  to  this  as  the  GGL 
generator;  it  generates  a  random  number  in  1.40 
Usees  on  a  360/91  and  in  16.00  usees  on  a  360/67. 

A  version  has  been  written  for  the  360/67  at  the 
Naval  Postgraduate  School  using  a  division  simu¬ 
lation  algorithm  due  to  Lehmer  (see  Payne,  Rabung, 
Bagyo,  1969;  Liniger,  1961). 
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Test  results  for  GGL  are  given  in  Lewis,  Goodman, 
and  Miller  (1969)  and  extensive  subsequent  use 
turned  up  no  obvious  problems,  though  constant 
care  was  exercised.  For  example,  in  Table  1  the 
maximum  perlodogram  values  with  1/2-Weibull  dis¬ 
tributed  variates  (squares  of  exponentially  dis¬ 
tributed  ' iriates)  are  very  large.  In  this  simu¬ 
lation  r  «  6,  m'  »  750,000  and  m  "  4,500,000. 

A.  ft  check,  normal  deviates  were  used  in  a  very 
large  simulation  and  no  discrepancy  from  the 
exact  distribution  (shown  in  Table  1)  even  at 
0.999  quantiles  was  found.  This  GCL  random 
numoer  generator  was  used  in  the  Princeton  study 
(Andrews,  et  al,  1972)  and  is  used  in  APL. 

Nevertheless,  valid  doubts  continue  to  be  expres¬ 
sed  about  the  use  of  the  GGL  generator  in  large- 
scale  computations,  particularly  in  light  of 
Marsaglla's  results  (Marsaglia,  1968,  1972)  on  the 
structure  of  sequences  of  numbers  from  congruen- 
tial  generators.  These  results  have  shed  a  lot 
of  light  on  problems  which  can  be  encountered 
with  congruential  generators,  but  I  don't  be¬ 
lieve  they  tell  the  whole  story.  There  are  also 
new  types  of  generators  being  advocated.  Some 
of  these  arc  too  cumbersome  for  consideration, 
but  others  are  popular,  in  particular  the  Taus- 
worthe  or  shift  register  generators  (Tausworthe, 
1965).  Some  doubt  has  been  cast  on  the  statis¬ 
tical  properties  of  these  shift  register  pseudo¬ 
random  number  generators  recently;  my  own 
preference  is,  for  speed  and  simplicity,  to  go  to 
shuffled  congruential  generators  (Marsaglia  and 
Bray,  1968).  Tukey  (1972)  ascribes  this  l  ea 
to  Gentlemen  but  it  seems  quite  old  and  was  put 
forward  by  Marsaglia  in  the  early  1960's.  I 
have  found  no  documentation  of  the  statistical 
properties  of  shuffled  generators,  although  they 
are  intuitively  appealing. 

We  have  undertaken  further  statistical  tests  of 
some  of  the  above  generators  at  the  Naval  Post¬ 
graduate  School.  In  particular,  we  have  been 
Interested  in  correlating  test  results  with 


results  of  Couveyou  and  MacPherson,  1967;  (see 
also  Knuth,  1969,  pp.  82-100.)  and  Marsaglia,  1972. 
It  seems  to  me  that  the  Couveyou-hacPherson  Fourier 
analysis  is  the  best  analytical  tool  for  predicting 
performance  of  random  number  generators  that  has 
appeared.  Marsaglla's  results  [1972]  on  the  lat¬ 
tice  structure  of  the  congruential  generators  are 
also  useful. 

The  tests  referred  to  started  with  the  GGL  random 
number  generator,  RANDU  and  a  TAUSWORTHE  generator 
(Tausworthe,  1965)  auu  the  runs  test.  As  in  Lewis, 
Goodman,  and  Miller  (1969),  runs  of  length  eight 
or  longer  are  pooled  and  a  Chi-square  statistic 
computed.  Nominally,  this  has  a  Chi-square  dis¬ 
tribution  with  7  degrees  of  freedom  and  we  denote 
2 

it  as  y  .  Table  2  gives  summary  rcaiistics  on 
16 

20  runs  of  2  numbers  each  for  the  three  gener- 
rlors.  The  Tausworthe  generator  is  now  analytic¬ 
ally  known  to  have  poor  runs  performance  (Tooth! 11, 
Robinson,  and  Adams,  1971).  The  runs  test  rejects 
neither  GGL  nor  RANDU  if  the  test  statistic  is  as¬ 
sumed  tc  be  distributed  as  a  Chi-square  variate 
with  7  degrees  of  freedom.  As  mentioned  before, 
RANDU  is  known  to  be  poor;  this  is  shown  in  Table 
3  giving  the  Couveyou-MacPherson  wave  numbers  for 
GGL  and  RANDU  for  dimensions  up  to  7.  It  is  in 
higher  dimensions  that  RANDU  is  particularly  poor^ 

Table  2.  Runs  test;  Chi-square  statistic. 


o  j 


GGL 

6.846 

4.084 

RANDU 

7.939 

3.502 

TAUSWORTHE 

9.972 

10.428 

Table  3.  Wave 

numbers  for 

two  generators.* 

GGL 

RANDU 

Dimension 

2 

16,807 

23,172 

3 

638.9 

10.86 

4 

146.25 

10.77 

5 

67.21 

10.77 

6 

29.92 

7 

16.55 

fP» I ! ■' ' w  '■■Ul  ,u» 


*  I  am  indebted  to  Dr.  L.  R.  Turner,  NASA  Lewis  Research  Centei ,  Cleveland,  Ohio 
for  these  numbers. 
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A  comparison  of  the  three  samples  of  size  20  using 

a  two-sample  Kolmogorov  test  rejects  the  hypothesis 

that  any  of  them  are  from  the  same  distribution! 

This  is  a  sad  result  for  large-scale  simulation, 

particularly  If  one  were  trying  to  simulate  the 

distribution  of  the  Chi-square  summary  statistic, 

2 

Xy,  for  the  runs  test. 

The  three  generators  have  been  shuffled  and  the 
Chi-square  statistics  of  100  tests  for  samples 
of  size  2^  numbers  from  each  generator  were 
found  to  be  dlstrlbutionally  commensurate.  The 
shuffled  Tausworthe  generator  was  still  suspect, 
however. 

Results  of  this  testing  will  be  given  elsewhere; 
other  statistical  tests  are  being  evaluated.  The 
conclusions  so  far  are  interesting.  There  is 
mild  evidence  that  shuffling  helps.  The  main  con¬ 
clusion  seems  to  be,  however,  that  the  runs  test 
is  virtually  useless.  (Note  that  Bates  and  Zlrkle 
accept  RANUU,  partly  on  the  basis  of  runs  tests.) 
And  although  recent  books  (Newman  and  Odell,  1971; 
Maisel  and  Gnugnoli,  1972)  tout  the  runs  test  as 
amongst  the  best,  I  have  been  unable  to  find  any 
documentation  for  this.  It  seems  to  be  an  example 
of  a  stochastic  rumor  to  which  I  too  have  contri¬ 
buted  (Lewis,  Goodman,  and  Miller,  1969).  Perhaps 
some  readers  can  guide  me  to  work  on  the  power 
of  the  runs  test;  Lehman  (1959,  p.  155)  points 
out  that  a  modified  runs  test  has  certain  optimum 
propertied  in  testing  for  independence  in  a  binary 
sequence  against  lst-order  Markov  alternatives. 

Even  results  on  the  power  of  the  runs  test  re¬ 
lative  to  the  serial  correlation  test  for  first 
order  normal  autoregressive  schemes  would  be  of 
interest . 

There  is  clearly  much  work  to  be  done  in  random 
number  generation.  I  have  also  not  mentioned 
the  need  for  efficiently  generated,  reliable  nor¬ 
mally  distributed  random  variates.  These,  and 
perhaps  several  other  kinds  of  random  deviates 
should  be  provided  as  primitives,  Just  as  random 
numbers  are  provided  as  primitives  in  APL.  A 
package  to  generate  normally  and  exponentially 
distributed  random  variables  is  available  from 


Marsaglia  at  McGill  University.  It  uses  some  of 
Marsaglla's  own  methods  and  is  very  fast.  A  survey 
of  some  of  these  methods  is  given  by  Ahrens  and 
Dieter  (1972). 

b)  Ordering. 

Ordering  (sorting)  of  quantities,  or  obtaining 
ranks,  is  a  basic  operation  in  statistical  compu¬ 
tation;  a  survey  is  given  by  Martin  (1971).  The 
main  use  we  had  for  it  initially  was  in  quantile 
estimation,  and  here  it  was  a  bottleneck  since, 
in  general,  ordering  of  n  quantities  takes  a 
number  of  operations  proportional  to  n(£.n  n)  and 
memory  capacity  proportional  to  n.  The  quantile 
estimation  problem  is  discussed  below;  the  use  of 
ordering  is  now  used  mainly  in  COMPSTAT  in  gener¬ 
ating  statistics  such  as  the  median.  There  ere 
severs!  points  to  be  made  here. 

(i)  Uniformly  distributed  random  variates 
can  be  ordered  by  address  modification  schemes 
(Isaac  and  Singleton,  1956)  in  time  propor¬ 
tional  to  n,  although  for  large  computations 
3n  memory  positions  aie  needed  to  avoid  over¬ 
flows.  An  algorithm  for  this  type  of  sorting 
is  provided  in  COMPSTAT. 

(ii)  It  is  clear  that  by  uBing  pilot  estimates 
of  a  non-uniform  distribution,  address  modi¬ 
fication  schemes  can  be  used  on  any  data. 

These  take,  asymptotically,  n  operations, 
but  for  reasonable  sample  sizes  the  procedure 
is  slow  and  cumbersome  programming-wise.  The 
scheme  is  due  to  Floyd  at  Stanford.  There  is 
renewed  interest  in  this  area  and  Chambers 
(1971)  has  a  scheme  for  partial  sorting  which 
is  more  efficient  than  an  n((n  n)  sort. 

Andrews  (personal  communication)  also  has  a 
scheme  for  obtaining  the  median;  it  uses  a 
pilot  estimation  scheme  and  is  subject  to 
overflows  which  could  be  a  problem  in  large- 
scale  simulations. 

(iii)  Schemes  using  the  Markov  property  of 
the  gaps  (differences  between  successive  order 
statistics)  (see  David,  1971,  p.  17)  are 
available  for  producing  ordered  uniform  var¬ 
iates  (Schucany,  1972;  Lurie  and  Hartley,  1972). 


8 


(i  -  1 ,  . .  .n) 


We  had  tried  In  COMPSTAT  an  equivalent  scheme 
based  on  the  Independence  of  the  gap  statis¬ 
tics  for  exponentially  distributed  variates. 
These  schemes  for  moderate  n  are  more  time 
consuming  than  the  n(log  n)  schemes,  but  are 
more  efficient  in  use  of  memory  space.  Their 
primary  use  would  seem  to  be  when  only  a  few 
of  the  low  or  high  order  statistics  are  needed. 
Two  points  should  be  made  here: 

1.  Ic  is  computationally  easier  to  generate 

high  order  statistics,  rather  than  the  low 
order  statistics  advocated  by  Lurie  and 
Hartley  (1972)  and  Shucany  (1972).  Denoting 
the  uniform  variates  by  and  the  ordered 

uniform  variates  by  U^,  we  have 

prot>{  U(u  *  u(1)l»1+1  '  »(ltl)}  ‘ 

(1-1,2,. ..n;  u(1)  s  u(1+1),  Un+1  =  1) 

If  only  low  order  uniform  order  statistics  are 
required,  they  are  generated  as  *  1  - 

U(n+l-i)  ' 

2.  The  time  consuming  operation  in  the  above 
is  to  take  the  (l/i)th  power.  This  is  done, 
usually,  using  logarithms  and  the  scheme  is 
then  equivalent  to  generating  order  statistics 
from  a  unit  exponential  distribution.  However, 
since  it  is  much  faster  to  generate  exponen¬ 
tial  var lates  using  some  of  Marsaglla's  sam¬ 
pling  procedures  than  it  i£  u>  generate  them 

by  taking  the  logarithm  of  jr  uniform  variate , 
it  is  faster  to  generate  ordered  uniform  ran¬ 
dom  numbers  by  starting  with  exponential  var¬ 
iates. 


The  basis  for  this  is  that  if  E^,  *  “  1»  2, 
...n,  denotes  ordered  unit  exponential  vari¬ 
ates  from  a  sample  of  size  n,  and  we  let 
E(0'  *  t*le  8aP  statistics  (Cox  and  Lewis, 
p.  26-27) 


D(i)  “  E(i)  "  E(i+1) 


(i  -  1,  n) 


are  Independent  exponentials  with  mean 
E(D ^i) )  -  (iri-l-i)  1.  Thus  if  we  have  n  unit 
exponentials,  generated  say  by  one  of  Marsag¬ 
lla's  schemes,  we  generate 


i 

E  -  I  (n+l-j )  E 
K  1  j-1 

and 

U(i)  -  1  -  exp  lE^}  (i  »  1,  ...n). 

An  n ( log  n)  sorting  and  ranking  scheme  is  also 
provided  in  COMPSTAT,  for  sorting  and  ordering  with¬ 
in  the  STATISTICS  GENERATOR. 

c)  Quantiles  and  Percentiles. 

Estimating  quantiles  was  the  second  biggest  bottle¬ 
neck  in  implementing  COMPSTAT.  Quantiles  are  more 
b;  oic  in  characterizing  distributions  than  percen¬ 
tiles,  although,  for  example,  one  is  Interested  in 
percentiles  when  evaluating  by  simulation  the  power 
of  a  test  based  on  a  statistic  S.  Thus,  given 
the  a-quantlle  x  of  under  a  null  hypothesis, 
one  wants  the  percentile  corresponding  to  S  and 
x^  under  a  different  hypothesis. 

Percentile  estimation  as  a  binomial  process  is  es¬ 
sentially  straightforward  and  Ideal  by  our  criter¬ 
ion  of  simplicity  and  economy  of  computation  and 
memory  requirements.  It  is  also  unbiased.  However, 
it  appears  that  greater  efficiency  should  be  ob¬ 
tained  by  coupling  estimates  at  different  x^'s, 
although  I  haven't  been  able  to  do  so.  Most  schemes 
appear  to  require  assumptions  about  boundedness  of 
the  probability  density  function.  Somerville  (1970) 
has  some  results  in  this  area;  it  appears  to  be  an 
area  for  further  research. 

Quantile  estimation  based  on  order  statistics  is 
advocated  in  most  texts  (see  David,  1971).  For 
large-scale  computation  the  sorting  time  required 
and  the  memory  capacity  is  prohibitive.  Stochastic 
approximation  schemes  (Robbins  and  Monro,  1951; 
Hodges  and  Lehman,  1956)  were  then  tried  but  found 
to  converge  at  an  impossibly  slow  rate  for  large 
quantiles.  These  two  quantile  estimation  schemes 
are  prime  examples  of  statistical  procedures  which 
are  not  attuned  to  computing  realities,  and  whose 
asymptotic  properties  are  deceptive  as  far  as  prac¬ 
tical  applications  are  concerned. 

A  solution  was  finally  found  (Goodman,  Lewis,  and 
Robbins,  1972)  which  combined  the  stochastic  ap¬ 
proximation  with  a  data  transformation.  Typically 
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If  Che  cx-quantile  was  required  (a>0.5),  the 
maxima  of  successive  groups  of  size  v  of  reali¬ 
zations  of  S  are  found.  The  problem  Is  then 
one,  If  u'  -  av,  of  finding  the  x^,  quantile, 

which  Is  equal  to  x  ,  In  a  distribution  which 
t  h  ^ 

is  the  v —  power  of  the  distribution  S,  Fg(x). 
By  caking  v  large  enough  to  make  a'  ~  1/2  the 
problem  becomes  one  of  estimating  a  median,  al¬ 
though  other  values  of  v  can  be  used.  Stochastic 

approximations  work  well  with  medians,  but  as  the 

-1/2 

bias  Is  apparently  of  order  m  ,  jackknifing 
Is  required  to  reduce  the  bias. 

The  present  scheme  (Goodman,  Lewis,  and  Robbins, 
1972)  based  on  the  maximum  transformation  and 
stochastic  approximation  solves  the  basic  pro¬ 
blems  of  quantile  estimation,  but  research  is  con¬ 
tinuing  to  improve  it.  Computationally  it  is  very 
good  since  finding  a  maximum  requires  only  two 
memory  cells  and  computation  time  ia  linear  in  m, 
the  number  rif  realizations  of  S  which  are  gen¬ 
erated.  It  is  also  simple  to  compute  in  parallel 
the  quantiles  for  several  levels,  e.g.  a  * 

0.990,  0.995,  0.999. 

D.  Salsburg  has  raised  the  question  as  to  whether 
one  wouldn't  want  to  order  the  data  anyway  to  do, 
for  instance,  a  normal  probability  plot  of  the 
simulated  distribution.  This  may  be  true  for 
samples  of  size  m  equal  to  about  500;  beyond 
that  the  sorting  in  a  large-scale  simulation  be¬ 
comes  onerous,  time-wise  and  memory-wise,  and  I 
feel  a  plot  using  16  quantiles,  as  in  Figure  2, 
plus  the  moments  in  Figure  3,  is  as  good  as  or 
better  than  a  full  probability  plot. 

d)  Bias  and  Bias  Reduction. 

It  is  essential  for  sensible  and  interpretable 
simulation  results  to  have  estimates  of  the  var¬ 
iances  of  the  simulated  quantities.  However, 
sectioning  the  m  replications  in  a  large-scale 
simulation  into  r  sections  of  m'  replications 
to  estimate  the  variance  of  estimates  (see 
Mosteller  and  Tukey,  1968)  brings  in  problems  of 
bias.  This  is  because  one  wants  r  to  be  about 
10  to  get  reliable  estimates  of  the  variance,  but 
the  rr  uiltlng  m'  may  be  too  small  to  reduce  the 


bias  in  the  simulated  quantity  to  acceptable  levels. 

This  problem  seems  to  be  well  In  hand  because  of 
the  jackknife  technique  for  bias  reduction  which 
was  developed  by  Quenouille  (1956),  pushed  by 
Tukey  (1958)  and  generalized  by  Schucany,  Gray, 
and  Owen  (1971)  and  Gray  and  Schucany  (1972).  A 
similar  technique  was  used  by  Caver  and  Hoel  (1970) 
in  examining  small-sample  Poisson  probability  es¬ 
timates.  Some  price  may  be  paid  in  inflation  of 
the  variance  of  the  estimator  (Miller,  j.964;  also 
Goodman,  Lewis,  and  Robbins,  1972  ,  for  a  specific 
case) . 

In  C0MPSTAT  the  jackknife  is  quite  simply  incor¬ 
porated  into  the  STATISTICS  GENERATOR. 

e)  Variance  Estimation. 

The  problem  of  bias  appears  to  have  been  alleviated 
directly  by  the  jackknife,  and  indirectly  because 
of  a  suggestion  by  Tukey  (1958)  that  the  sample 
standard  deviation  based  on  the  pseudo-values  in 
the  jackknife  procedure  be  used  to  estimate  the 
variance  of  the  jackknifed  estimate.  There  is 
some  evidence  that-  this  procedure  is  broadly 
applicable,  although  Miller  (1968)  pointed  out 
cases  where  it  can  give  poor  results.  In  general, 
n-fold  Jackknifing  in  a  small  sample  of  size  n 
can  give  an  estimate  with  a  very  inflated  variance, 
though  this  problem  disappears  as  n-*°°.  Relevant 
references  are  Arvesen  (1969),  a  review  by  Arvesen 
and  Salsburg  (1972),  and  Mosteller  and  Tukey  (1968). 

The  lackknifing  procedure  will  probably  be  most 
useful  when  available  computation  time  is  too  short 
for  sectioning.  For  a  description  of  variance 
estimation  techniques  based  on  sectioning,  see 
Mosteller  and  Tukey  (1968). 

f )  Variance  Reduction  Techniques. 

I  have  not  discussed  variance  reduction  techniques 
so  far.  An  excellent  review  is  given  by  Gaver 
(1969);  oee  also  Hammersley  and  Handscomb  (1964). 
These  variance  reduction  techniques  can  be  imple¬ 
mented  in  COMPSTAT  but  there  seem  to  be  several 
drawbacks,  mainly  that  the  methods  are  particular 
to  the  problems  at  hand.  Thus,  a  large  amount  of 
time  can  be  spend  deriving,  say,  an  antithetic 
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variate  for  a  particular  problem  and  this  may,  when 
lar«'  computers  are  available,  be  an  Inefflciert 
way  to  use  statisticians. 

The  most  important  drawback  to  most  methods,  how¬ 
ever,  is  that  a  method  that  reduces  the  variance 
of  an  estimate  of  the  mean  of  a  statistic  S  will 
often  inflate  the  variance  of  an  estimate  of  the 
variance  of  S.  This  is  clearly  true  for  many 
antithetic  variate  techniques  (Hammersley  and 
Mauldon,  1956)  and  would  be  worse  when  quantiles 
or  percentiles  are  also  required.  This  may  be 
all  right  in  nearly  normal  situations,  but  not  in 
others. 

Much  more  research  is  required  on  variance  reduction 
techniques  that  are  applicable  to  all  aspects  of 
the  characterization  of  a  distribution,  and  are 
easily  derived.  Control  variable  techniques 
(Fieller  and  Hartley,  1959)  seem  to  me  the  best 
candidate  for  this  role. 

An  empirical  control  variable  technique  can  be  im¬ 
plemented  with  COMPSTAT  when  exploration  is  re¬ 
quired  around  a  null  situation.  This  may,  for 
instance,  be  a  test  of  hypothesis  in  which  power 
against  small  deviations  is  of  interest.  Again, 
small  variations  in  scheduling  jorithms  in  com¬ 
plex  queues  might  be  of  interest  to  see  what 
improvement  they  make  to,  say,  throughput  time. 

One  might  then  do  a  very  precise  simulation  of  the 
characterizations  of  the  statistic  under  the  null 
hypothesis.  Fix  m',  the  number  of  replications 
per  section,  and  let  r,  the  number  of  sections, 
be  large  and  denote  by  (^(r)  the  estimated 
quantity  under  the  null  hypothesis.  This  will  be 
the  average  of  the  estimates  of  ^  from  the  r 
sections.  Results  for  the  sections  are  kept  in 
the  SIDEPUT,  together  with  the  seed  for  the  random 
number  generator  which  initiates  each  section  of 
the  simulation.  The  quantity  is  estimated  under 
alternative  conditions  using  the  same  random  num¬ 
bers  using  only  r'  sections,  where  r'  <<  r. 

Call  this  quantity  6£(r').  If  f^(r')  Is  the 
null  (average)  estimate  from  the  first  r'  sec¬ 
tions,  f^(r-r')  the  null  (average)  estimate  from 
the  last  r  -  r'  sections,  then  the  control 


variable  estimate  is 

5(r')  -  0E(r')  -  ^(r')  +  0Q(r) 

-  0e(r*)  -  (r-^)  ^j(r')  +  (£=-;-)  Tb(r-r' 

Then  _  _ 

var[  t.  (  r ')  |  -  var[  ^(r’)]  +  (^-)  var[  ^(r')J  - 

(^covlFtr'J^r'))  +  <L--L,.)var(^(r-r')]. 


The  common  random  numbers  used  to  generate  the  es¬ 
timates  should  make  the  estimates  ^(r')  and 
f^r')  highly  correlated,  and  the  above  equation 
is  the  variance  in  the  usual  control  variable  sit¬ 
uation  except  for  the  last  tern'.  If  r  Is  large 
relative  to  r'  this  last  term  should  be  small 
relative  to  the  other  terms. 


It  is  possible  to  use  subsequent  sections  of  size 
r'  in  the  original  simulation  of  to  explore 

other  alternatives,  say  6  ,  f.  ,  ....  There  are 

Ei  t2 

interesting  design  and  analysis  problems  in  this 
scheme  which  will  be  explored  elsewhere. 


One  final  point  should  be  made  here  about  control 
variables.  Let  6  be  the  uncontrolled  estimate 
and  f  the  controlled  estimate  (generated  from 
the  same  random  numbers).  It  is  not  often  real¬ 
ized  that  even  with  a  regression  adjusted  control 
(see  Gaver,  1969)  the  maximum  attainable  variance 
reduction  is 


var  (  E ) 
var  ( 0) 


where  o  is  the  correlation  between  5  and  6. 

It  can  be  very  difficult  and  time-consuming,  es¬ 
pecially  for  the  inexperienced  practitioner,  to 
find  a  control  which  gives  a  high  enough  p  to 
justify  the  pratitioners  time.  And  in  many  cases 
equivalent  speed  ups  can  be  achieved  by  using  more 
efficient  random  number  generators,  ordering  rou¬ 
tines,  etc. 


The  time  factor  to  achieve  a  high  p  is  one  rea¬ 
son  for  putting  forward  the  empirical  scheme  above 


g)  Planning  Simulation  Experiments. 

The  empirical  control  variable  suggestion  in  the 
previous  section  brings  up  the  whole  question  of 
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the  design  of  simulation  experiments,  Thus,  it 
would  be  reasonable  to  use  the  empl  leal  scheme, 
or  plan  an  experiment  around  the  null  value  f^? 
This  would  be  appropriate  if  the  range  of  para¬ 
meters  of  interest  were  known  In  advance.  The 
empirical  control  variable  technique  seems  at¬ 
tractive  as  an  on-line,  interactive  procedure, 
especially  when  estimates  of  the  variances  of  the 
estimates  are  available,  as  in  COMPSTAT.  Some 
formal  analysis  is  still  needed  and  this  could 
be  formidable. 

In  general,  it  would  seem  that  the  output  of 
large-scale  simulation  would  be  a  fertile  field 
for  application  of  techniques  of  analysis  of  var¬ 
iance  and  experimental  design.  1  am  not  familiar 
with  much  by  way  of  specific  applications;  several 
recent  books,  including  that  by  Mihram  (1972), 
which  I  have  not  examined  carefully,  do  treat 
analysis  of  simulation  experiments.  The  tendency, 
however,  does  seem  to  be  to  Just  regurgitate  the 
old  theory  without  specifically  worrying  about 
particular  problems  of  simulation  experiments. 

A  simple  case  occurs  when  an  experimenter  has  two 
variance  reduction  techniques  available,  say  two 
control  variables,  and  a  fixed  number  of  repli¬ 
cations  m  he  can  perform.  He  wants  to  choose 
the  control  variable  which  minimizes  the  variance 
of  the  final  estimate  of  a  parameter,  say  9, 
which  could  be  the  mean  of  a  statistic  S.  If 
tr.'  is  large  enough  so  that  the  estimates  in  each 
of  the  r  sections  (r*'«m)  are  unbiased  and 
normally  distributed,  this  is  a  classical  two  arm 
bandit  problem. 

I  know  of  no  one,  however,  who  has  actually  done 
this,  probably  because  the  benefit  of  reduced 
variance  doesn't  outweigh  the  extra  cost  of  tool¬ 
ing  up  for  two  estimates  of  G.  It  could  be 
feasible  with  COMPSTAT.  Once  more  than  one  para¬ 
meter  is  involved,  say  the  mean  and  variance  of 
S,  the  problem  is  much  more  complicated.  In 
general  I  think,  however,  that  as  computers  de¬ 
velop  simulation  will  make  many  statistical  prac¬ 
tices  developed  in  vacuo  widely  useful. 

Some  of  many  other  open  design  problems  can  be  , 


seen  by  considering  Figure  2,  where  estimated  quan¬ 
tiles  of  a  distribution  are  plotted.  One  would 
generally  want  to  smooth  these  plots  or  fit  some 
regression  function  to  assess  the  rate  of  conver¬ 
gence  to  the  asymptotic  normal  distribution.  There 
are  problems  in  that  the  number  of  simulations,  m, 
was  fixed  in  advance  and  thus,  the  variances  at 
each  n  vary.  Moreover,  one  would  want  to  couple 
the  smoothing  or  regression  analysis  of  the  var¬ 
ious  quantiles.  These  are  both  functionally  and 
statistically  correlated  for  each  n  across 
quantiles  and  with  n  for  each  quantile. 

Detailed  analysis  of  such  graohic  output  needs 
much  more  work;  it  is  possible  that  the  work  of 
Efron  and  Morris  (1972)  may  be  relevant  to  this 
problem. 

Besides  the  smoothing,  any  program  such  as  COMPSTAT 
should  provide  facility  for  direct  plotting  of 
output  tables  of  rounded  and  perhaps  smoothed  data. 
This  is  one  facility  computer  scientists  can  pro¬ 
vide  us  with. 

3.  MISCELLANEOUS  PROBLEMS  AND  OPEN  QUESTIONS. 

I  have  not  touched  on  many  questions  in  large-scale 
simulation.  A  few  are  discussed  here  to  emphasize 
that  there  are  many  problems  that  do  not  even  start 
to  fit  on  present  or  future  computers.  Thus,  sim¬ 
ulation,  especially  without  some  analytic  support, 
is  not  always  a  possible  way  out  of  problems,  al¬ 
though  some  people  feel  simulation  is  the  last 
resort.  Other  questions  discussed  below  indicate 
that  there  are  simple  problems  we  cannot  handle. 

a)  Conditional  distributions. 

Conditioning  poses  problems  in  simulations  which 
I  do  not  know  how  to  handle  efficiently.  Thus,  in 
fitting  exponential  polynomials  to  data  from  a  non- 
homogeneous  Poisson  process  (Lewis,  1972)  observed 
for  a  time  tp,  one  wants  to  condition  on  the 
number,  n,  of  events  observed  in  (0,tp).  The 
times  to  events  are  then  order  statistics 

from  a  uniform  random  sample  of  size  n.  In  test¬ 
ing  for  a  second  order  term  in  the  polynomial  one 
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wants  the  conditional  distribution  of  It^,  given 
n  and  Et^.  Conceptually  this  is  simple  to  see, 


» 


« 


as  lt^  is  the  distance  from  the  origin  to  the 

n  -  1  dimensional  hyperplane  defined  by  fixing 
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~tj.  The  Joint  asymptotic  normality  of  It^  and 
lt^  give  tie  result  that  for  large  n,  Tt|/n 
has  a  conditional  normal  distribution  with  mean 


(Lewis,  1972). 
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and  standard  deviation 

1  co 
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Hou  does  one  simulate  this  problem  for  small  n 
and  assess  the  rate  of  convergence  to  n?  This 
must,  be  a  very  common  problem. 


b)  Multivariate  problems. 


I  have  not  mentioned  simulation  of  multivariate 
statistics  S.  An  immediate  problem  here  is  that 
quantiles  and  percentiles  are  not  uniquely  defined, 
so  one  has  to  use  joint  moments,  which  could  be 
estimated  in  COMPSTAT,  or  rely  on  probability 
density  functions,  I  have  not  discussed  density 
estimation  here  at  all.  Multivariate  problems, 
of  course,  also  bring  in  new  aspects  of  graphical 
and  tabular  output  which  are  non-trivial. 


c)  Simulated  maximum  llkellhocd. 


As  a  last  stab,  I  would  like  to  mention  another 
area  which  interests  me.  In  complicated  time 
series  we  now  have  computationally  feasible  tools 
such  as  spectral  analysis  to  help  in  defining  and 
delineating  models.  Once  this  is  done,  however, 
there  are  often  no  reasonable  ways  of  estimating 
parameters  of  the  model,  especially  since  likeli¬ 
hoods  cannot  be  derived,  even  though  the  model  is 
structurally  simple.  It  would  be  useful  to  simu¬ 
late  the  Joint  density  of  the  observations  at  the 
observed  data  point  as  a  function  of  the  para¬ 
meters  so  as  to  find  the  maximum  likelihood  es¬ 
timates  of  the  parameters.  I  assume  this  ie  worth 
the  cost  to  the  experimenter.  One  then  has  a  more 
complicated  case  of  a),  closely  related  to  re¬ 
sponse  surface  designs.  The  solution  seems  to  be 
far  away. 


The  reader  is  referred  to  the  papers  by  Tukey 


(1972  a,  b)  for  further  problems. 
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